An Efficient Treatment Of Japanese Verb Inflection For Morphological Analysis
Abstract
Because of its simple appearance, Japanese verb inflection has never been treated seriously. In this paper we reconsider traditional lexical treatments of Japanese verb inflection and propose a new treatment that uses newly devised segmenting units. We show that our proposed treatment minimizes the number of lexical entries and avoids useless segmentation. It requires 20 to 40% less chart parsing computation, and it is also suitable for error correction in optical character readers (OCRs).
Introduction
In this paper we focus on lexical entries for coping with Japanese verb inflection. The problem of treating verb inflection comes from the nature of written Japanese, in which word boundaries are not usually indicated explicitly. The morphological analyzer must therefore check for the existence of a verb and its inflection at each position in an input character string. As a consequence, an awkward treatment of verb inflection may result in unacceptably low computational efficiency.
Japanese verb inflection seems to be quite simple, and for that reason it has never been a central subject of natural language processing (NLP) studies. A second reason is that, in the early stages of Japanese NLP, the most time-consuming process in Japanese morphological analysis (JMA) was found to be accessing the dictionary stored in secondary memory. Greater effort was therefore put into designing the dictionary data structure and methods for quick access.
The situation, however, has changed. Highly efficient data structures based on the TRIE structure seem to have finally solved the data structure problems (for instance, Morimoto and Aoe, 1993). The access problem is also being resolved by the emergence of cheap main memory on which the dictionary can be stored directly, and of a dictionary-accessing chip that can access the dictionary thousands of times faster (Fukushima, 1991). As a result, the problem of treating Japanese verb inflection is becoming more important.
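The position-by-position lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analyzer used in the paper: the lexicon entries, their tags, and the function names `build_trie`, `entries_at`, and `lookup_all_positions` are all hypothetical, and a nested-dictionary trie stands in for the more compact TRIE structures cited above.

```python
# Toy lexicon: hypothetical entries and tags, invented for illustration only.
LEXICON = {
    "か": "particle",
    "かき": "verb form",
    "き": "noun",
    "ます": "auxiliary",
}

END = "$"  # marks the end of an entry; assumed never to occur in the input


def build_trie(lexicon):
    """Build a nested-dict trie mapping each entry's characters to its tag."""
    root = {}
    for word, tag in lexicon.items():
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = tag
    return root


def entries_at(trie, text, start):
    """Return (surface, tag) for every lexicon entry beginning at text[start]."""
    found = []
    node = trie
    for end in range(start, len(text)):
        node = node.get(text[end])
        if node is None:
            break  # no entry continues with this character
        if END in node:
            found.append((text[start:end + 1], node[END]))
    return found


def lookup_all_positions(trie, text):
    """Run the prefix search at every character position of unsegmented text."""
    return {i: entries_at(trie, text, i) for i in range(len(text))}


if __name__ == "__main__":
    trie = build_trie(LEXICON)
    for pos, hits in lookup_all_positions(trie, "かきます").items():
        print(pos, hits)
```

Because the search restarts at every character, each extra lexical entry that matches at some position adds candidate edges for the parser to consider, which is why the choice of segmenting units affects the amount of downstream computation.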
Although the phonological description of Japanese verb inflection is very simple, it cannot be applied to JMA directly, because each Japanese hiragana phonogram basically corresponds to a consonant-vowel pair, not to a phoneme. Traditional school grammar, on the other hand, gives a description based on the ordinary Japanese writing system, and has thus been widely used in JMA. However, it is neither as rational as the phonological description nor the most efficient from a computational viewpoint.
We reconsider lexical entries for verb inflection and propose a new method for segmenting verbal complexes. Though our method is based on the ordinary Japanese writing system, it has various advantages over existing ones: 1) it minimizes the number of lexical entries while avoiding useless segmentation; 2) it requires 20 to 40% less chart parsing computation, where the parser is based on dynamic programming and is suitable for robust analysis; 3) it is also suitable for error correction in OCRs; 4) it requires a smaller incident matrix than other treatments, making the morphological analyzer easier to construct and maintain.
Section 1 gives an overview of descriptions of Japanese verb inflection in terms of phonology and in terms of traditional school grammar. Section 2 reviews three different treatments of verb inflection in NLP, which are based on the two descriptions in Section 1. Section 3 introduces our proposed treatment, and Section 4 shows the advantages of our treatment from several aspects, including a quantitative comparison of the computational efficiency of a chart parser.
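The mismatch between phonemic stems and hiragana noted at the start of this passage can be made concrete with a small sketch. The verb 書く (kaku, "write") and its forms are a standard textbook example chosen here for illustration; they are not taken from the paper, and the variable names are hypothetical. The code compares the longest common prefix of the forms in romanized (phonemic) spelling with that of the same forms in hiragana.

```python
from os.path import commonprefix  # character-wise longest common prefix

# Five forms of the verb kaku ("write"), romanized and in hiragana.
# Illustrative example only; not data from the paper.
ROMAJI_FORMS = ["kakanai", "kakimasu", "kaku", "kakeba", "kakoo"]
KANA_FORMS = ["かかない", "かきます", "かく", "かけば", "かこう"]

# Phonemic view: the stem "kak" is constant across all forms.
phonemic_stem = commonprefix(ROMAJI_FORMS)

# Kana view: the stem-final consonant k is fused with the following vowel
# into one character (か, き, く, け, こ), so only "か" is shared.
kana_stem = commonprefix(KANA_FORMS)

if __name__ == "__main__":
    print(phonemic_stem)
    print(kana_stem)
```

In the romanized forms the boundary between stem and ending falls after the consonant k, but in hiragana that boundary lies inside a single character, so a segmentation stated over phonemes cannot be applied to kana text as it stands.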
Publication date: 1994